GEDCOM
Latest release: GEDCOM 5.5 Standard + Errata Sheet
Genre: Genealogy data exchange
Standard: De facto
URL: gedcom.org (historical)

GEDCOM (an acronym for GEnealogical Data COMmunication) is a proprietary and open de facto specification for exchanging genealogical data between different genealogy software. It was not developed through a formal standards organization: "Thus GEDCOM was born as a deliberate, de facto standard, to be followed only by those who felt it was in their best interest to do so" (Bill Harten, Brigham Young University, "GEDCOM and Formal Standards Organizations", Wed, 24 Jan 1996 11:53:52 -0700).

GEDCOM was developed by The Church of Jesus Christ of Latter-day Saints (LDS Church) as an aid to genealogical research. In the words of one participant, "The goal was to try and provide a standard to allow developers to provide a vehicle for their users to share genealogical conclusions and supporting evidence with others" (Jed R. Allen, Brigham Young University, "rep: T Jenkins - open letter to GEDCOM-L", 29 Sep 1995 17:40:04 -0600; GEDCOM-L Archives, September 1995, week 5, #7).

A GEDCOM file is plain text (usually either ANSEL or ASCII) containing genealogical information about individuals, and metadata linking these records together. Most genealogy software supports importing from and/or exporting to GEDCOM format.
However, some genealogy programs use proprietary extensions to the GEDCOM format that other genealogy programs do not always recognize, for example the GEDCOM 5.5 EL (Extended Locations) specification (see the GEDCOM 5.5 EL (Extended Locations) specification; FHUG Wish List, "Ability to save information against places", which asks for "Support for parts of the GEDCOM 5.5EL proposal"; Gramps Bugtracker issue 0000688, "Support for Gedcom 5.5EL").

In February 2012, at the RootsTech 2012 conference, FamilySearch outlined a major new project around genealogical standards called GEDCOM X, and invited collaboration. In August 2012, FamilySearch employee and GEDCOM X project leader Ryan Heaton dropped the claim that GEDCOM X is the new industry standard and repositioned GEDCOM X as another FamilySearch open source project (Tamura Jones, "GEDCOM X: no industry standard, FamilySearch abandons GEDCOM X push", Modern Software Experience, 2012-08-31).

GEDCOM model

GEDCOM uses a lineage-linked data model. This data model is based on the nuclear family and the individual. This contrasts with evidence-based models, where data are structured to reflect the supporting evidence. In the GEDCOM lineage-linked data model, all data are structured to reflect the believed reality, that is, actual (or hypothesized) nuclear families and individuals.

GEDCOM file structure

A GEDCOM file consists of a header section, records, and a trailer section. Within these sections, records represent people (INDI records), families (FAM records), sources of information (SOUR records), and other miscellaneous records, including notes. Every line of a GEDCOM file begins with a level number: all top-level records (HEAD, TRLR, SUBN, and each INDI, FAM, OBJE, NOTE, REPO, SOUR, and SUBM) begin with a line at level 0, while the lines subordinate to them carry positive integer level numbers.
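The level-number structure described above can be sketched in Python. This is only a minimal illustration, not a conforming GEDCOM reader: the names `Node`, `parse_gedcom` and `SAMPLE` are hypothetical, the sample data is a shortened illustrative file, and the sketch assumes each line has the form "level, optional @XREF@ identifier, tag, optional value" and ignores details such as line continuation (CONT/CONC) and character set handling.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed line shape: level, optional @XREF@ identifier, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@\s]+@)\s+)?([A-Za-z0-9_]+)(?:\s(.*))?$")


@dataclass
class Node:
    """One GEDCOM line together with the lines subordinate to it."""
    level: int
    tag: str
    xref: Optional[str] = None
    value: str = ""
    children: List["Node"] = field(default_factory=list)


def parse_gedcom(text: str) -> List[Node]:
    """Return the level-0 records, each with its lower-level lines nested."""
    records: List[Node] = []
    stack: List[Node] = []  # stack[i] is the most recent node seen at level i
    for lineno, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # skip blank lines
        match = LINE_RE.match(raw)
        if match is None:
            raise ValueError(f"line {lineno}: not a GEDCOM line: {raw!r}")
        level = int(match.group(1))
        if level > len(stack):
            # A line may be at most one level deeper than the line before it.
            raise ValueError(f"line {lineno}: level {level} has no parent line")
        node = Node(level, match.group(3), match.group(2), match.group(4) or "")
        del stack[level:]  # close any records at this level or deeper
        if level == 0:
            records.append(node)
        else:
            stack[level - 1].children.append(node)
        stack.append(node)
    return records


# Shortened, illustrative sample (not taken from the specification).
SAMPLE = """\
0 HEAD
1 GEDC
2 VERS 5.5
1 CHAR ASCII
0 @I1@ INDI
1 NAME Bob /Cox/
1 FAMS @F1@
0 @I2@ INDI
1 NAME Joann /Para/
1 FAMS @F1@
0 @I3@ INDI
1 NAME Bobby Jo /Cox/
1 FAMC @F1@
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 CHIL @I3@
0 TRLR
"""

records = parse_gedcom(SAMPLE)
for rec in records:
    print(rec.level, rec.xref or "-", rec.tag, [child.tag for child in rec.children])
```

Keeping a stack indexed by level means each line is attached to the most recent line one level above it, which is how the level numbers express nesting without any closing markers.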
Although it is theoretically possible to write a GEDCOM file by hand, the format was designed to be used with software and thus is not especially human-friendly. A GEDCOM validator that checks the structure of a GEDCOM file is included in the PhpGedView project (see the source code of PhpGedView's GEDCOM validator), though it is not meant to be a standalone validator. For standalone validation, there is "The Windows GEDCOM Validator" (dead link) or the older, unmaintained Gedcheck from the LDS Church, which "uses a grammar file for the specific version of GEDCOM you want to check against."

During 2001, the GEDCOM TestBook Project evaluated how well four popular genealogy programs conformed to the GEDCOM 5.5 standard using the Gedcheck program. Findings showed that a number of problems existed and that "The most commonly found fault leading to data loss was the failure to read the NOTE tag at all the possible levels at which it may appear." (Genealogical Computing, Summer 2001, Vol. 21.1, 7/1/2001, Ancestry.com archive). In 2005, the Genealogical Software Report Card, evaluated by Bill Mumford (who participated in the original GEDCOM TestBook Project), included testing against the GEDCOM 5.5 standard using the Gedcheck program (S. W. Mumford, The Genealogical Software Report Card 2000, last updated March 2005; the test results are in the PDFs under "Reviews from the NGS Newsmagazine and its Predecessors").

Example

Consider a sample GEDCOM file exported from Reunion. The first column of each line indicates an indentation level. The header (HEAD) includes the source program and version (Reunion, V8.0), the GEDCOM version (5.5), and the character encoding (MACINTOSH), which is invalid: according to the GEDCOM 5.5 specification, the valid choices are ANSEL, UNICODE, or ASCII. The individual records (INDI) define Bob Cox (ID 1, written @I1@), Joann Para (ID 2), and Bobby Jo Cox (ID 3).
The family record (FAM) links the husband (HUSB), wife (WIFE), and child (CHIL) by their ID numbers.

Versions

The current version of the specification is GEDCOM 5.5, which was released on 12 January 1996. A subsequent draft, GEDCOM 5.5.1, was issued in 1999, introducing nine new tags, including WWW, EMAIL and FACT, and adding UTF-8 as an approved character encoding (GEDCOM 5.5.1 draft specification). This draft has not been formally approved, but its provisions have been adopted in part by a number of genealogy programs, and it is used by FamilySearch.org. Examples of partial adoption: "GED-GEN is based on GEDCOM version 5.5.1 (draft), dated 2 October 1999. The following record types are parsed: header, individual, family, notes, source, and repository. However not all elements within these records are processed." (GED-GEN Introduction, Specifications); "Note : GRAMPS 3.0.x supports a part of GEDCOM 5.5.1 on export, which is not supported by most programs" (Gramps Bugtracker issue 0000688, "Support for Gedcom 5.5EL", note 0008068, romjerome (developer), 2009-01-25 06:13); "MyBlood supports the GEDCOM 5.5 and 5.5.1 file format." (MyBlood Support - Forum, FAQ, Know Problems). For FamilySearch, "The GEDCOM v5.5.1 (http://www.phpgedview.net/ged551-5.pdf) specification was used as the target for the GEDCOM side." (FamilySearchtoGEDCOMMapping.doc, FamilySearch XML to GEDCOM Mapping).
While PAF 5.2 does support GEDCOM 5.5, it uses UTF-8 as its internal character set, a feature introduced in the GEDCOM 5.5.1 draft, and can output a UTF-8 GEDCOM (Personal Ancestral File 5.2 and PAF Companion 5.4, Software Version Changes, Release 5.0.1.4, 22 December 2000, item "10.GEDCOM improvements", whose table lists destination PAF 5, GEDCOM version 5.5, character set UTF-8; Personal Ancestral File 5.1, NGS Newsmagazine, Jan/Feb 2002: "Also noted in a second test was the use of four tags from a later draft version of the Gedcom specification, FONE (phonetic name), ROMN (romanized name), EMAIL (e-mail), and _UID").

On 23 January 2002, a draft (beta) version of GEDCOM 6.0 was released for developer study only; it was not a complete specification, and developers were advised not to begin implementing it in their software. For example, descriptions of the meaning and expected contents of tags were not included (GEDCOM 6.0 specification). GEDCOM 6.0 was to be the first version to store data in XML format, and was to change the preferred character set from ANSEL to Unicode.

Lineage-linked GEDCOM is the deliberate de facto common denominator. Although version 5.5 of the GEDCOM standard was first published in 1996, many genealogical software suppliers have yet to support multilingual Unicode text (instead of the ANSEL character set), a feature introduced with that version of the specification. Uniform use of Unicode would allow international character sets to be used. One example is the storage of East Asian names in their original Chinese, Japanese and Korean (CJK) characters, without which they could be ambiguous and of little use for genealogical or historical research.
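Because a reader has to know the character set before it can decode names correctly, one possible approach is to look at the header's CHAR line first. The Python sketch below is an assumption-laden illustration, not a method taken from the specification: the function names `declared_charset` and `decode_gedcom` and the `CODECS` table are hypothetical, the mapping from GEDCOM character set names to Python codecs is an assumption, and ANSEL is reported as unsupported because Python's standard library has no ANSEL codec.

```python
from typing import Optional

# Assumed mapping from GEDCOM character set names to Python codec names.
# "utf-8-sig" accepts UTF-8 with or without a byte order mark.
CODECS = {"ASCII": "ascii", "UTF-8": "utf-8-sig", "UNICODE": "utf-16"}


def declared_charset(raw: bytes) -> Optional[str]:
    """Return the character set named on the header's '1 CHAR' line, if any."""
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "UNICODE"  # a UTF-16 byte order mark settles the question
    # Latin-1 maps every byte to a character, so this scan cannot fail;
    # only the ASCII tags and level numbers are inspected here.
    for line in raw.decode("latin-1").splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "1" and parts[1] == "CHAR":
            return parts[2].upper()
        if len(parts) >= 2 and parts[0] == "0" and parts[1] != "HEAD":
            break  # the header has ended without a CHAR line
    return None


def decode_gedcom(raw: bytes) -> str:
    """Decode a GEDCOM file according to its declared character set."""
    name = declared_charset(raw)
    if name == "ANSEL":
        # No ANSEL codec ships with Python; a real reader needs its own decoder.
        raise NotImplementedError("ANSEL decoding is not implemented in this sketch")
    codec = CODECS.get(name or "")
    if codec is None:
        raise ValueError(f"undeclared or unsupported character set: {name!r}")
    return raw.decode(codec)


# Illustrative data only: a name stored in its original CJK characters.
data = "0 HEAD\n1 CHAR UTF-8\n0 @I1@ INDI\n1 NAME 小明 /王/\n0 TRLR\n".encode("utf-8")
print(declared_charset(data))
print(decode_gedcom(data).splitlines()[3])
```

A file that declares a value outside this table, such as MACINTOSH, is rejected rather than guessed at, which mirrors the point that only certain character set names are valid.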
Limitations

Support for multi-person events and sources

A GEDCOM file can contain information on events such as births, deaths, census records, ship's records, and marriages; as a rule of thumb, an event is something that took place at a specific time and a specific place (even if the time and place are not known). GEDCOM files can also contain attributes such as physical description, occupation, and total number of children; unlike events, attributes generally cannot be associated with a specific time or place.

The GEDCOM specification requires that each event or attribute be associated with exactly one individual or family (GEDCOM Standard 5.5, pp. 26-27). This causes redundancy for events such as census records, where the actual census entry often contains information on multiple individuals. In the GEDCOM file, a separate census (CENS) event must be added for each individual referenced. Some genealogy programs, such as GRAMPS and The Master Genealogist, have elaborate database structures for sources that are used, among other things, to represent multi-person events. When a database is exported from one of these programs to GEDCOM, these structures cannot be represented because of this limitation, so the event or source information, including all of the relevant citation reference information, must be duplicated in each place where it is used. This duplication makes it difficult for the user to maintain the information related to sources.

By contrast, events associated with a family, such as a marriage, are stored only once in a GEDCOM file, as part of the family (FAM) record, and both spouses are then linked to that single family record.

Ambiguity in the specification

The GEDCOM specification was made purposefully flexible to support many ways of encoding data, particularly in the area of sources.
This flexibility has led to a great deal of ambiguity, with the side effect that some genealogy programs that import GEDCOM do not import all of the data from a file.

Support for varying definitions of families and relationships

GEDCOM does not explicitly support data representation of many types of close interpersonal relationships, such as same-sex marriages, domestic partnerships, cohabitation, polyamory, polygamy or incest, but such relationships, and any others, can be represented using the ASSO tag.

Ordering of events that do not have dates

The GEDCOM specification does not offer explicit support for keeping a known order of events. In particular, the order of relationships (FAMS) for a person and the order of the children within a relationship (FAM) can be lost. In many cases the sequence of events can be derived from the associated dates, but dates are not always known, in particular when dealing with data from centuries ago. For example, a person may have had two relationships, both with unknown dates, where it is nevertheless known from descriptions which of the two came second. The order in which these FAMS are recorded in GEDCOM's INDI record will depend on the exporting program. In Aldfaer (http://www.aldfaer.nl), for instance, the sequence depends on the ordering of the data by the user (alphabetical, chronological, reference, etc.). The proposed XML GEDCOM standard does not address this issue either.

Lesser-known features

GEDCOM has many features that are not commonly used, and hence are unknown to some people. Some software packages do not support all the features that the GEDCOM standard allows.

Multimedia

The GEDCOM standard does support the inclusion of multimedia objects, for example photos of individuals (GEDCOM Standard 5.5, p. 28).
Such multimedia objects can be either included in the GEDCOM file itself (called the "embedded form") or in an external file where the name of the external file is specified in the GEDCOM file (called the "linked form"). Embedding multimedia directly in the GEDCOM file makes transmission of data easier, in that all of the information (including the multimedia data) is in one file, but the resulting file can be enormous. Linking multimedia keeps the size of the GEDCOM file under control, but then when transmitting the file, the multimedia objects must either be transmitted separately or archived together with the GEDCOM into one larger file. Support for embedding media directly was dropped in the draft 5.5.1 standard (Draft GEDCOM Standard 5.5.1, p. 6).

Conflicting information

The GEDCOM standard does allow for the specification of multiple opinions or conflicting data, simply by specifying multiple records of the same type. For example, if an individual's birth date was recorded as 10 January 1800 on the birth certificate, but 11 January 1800 on the death certificate, two BIRT records for that individual would be included, the first with the 10 January 1800 date and giving the birth certificate as the source, and the second with the 11 January 1800 date and giving the death certificate as the source. The preferred record is usually listed first. This example encoded in GEDCOM might look like this:

    0 @I1@ INDI
    1 NAME John /Doe/
    1 BIRT
    2 DATE 10 JAN 1800
    2 SOUR @S1@
    3 DATA
    4 TEXT Transcription from birth certificate would go here
    3 NOTE This birth record is preferred because it comes from the birth certificate
    3 QUAY 2
    1 BIRT
    2 DATE 11 JAN 1800
    2 SOUR @S2@
    3 DATA
    4 TEXT Transcription from death certificate would go here
    3 QUAY 2

Conflicting data may also be the result of user errors. The standard does not specify in any way that the contents must be consistent.
A birth date like "10 APR 1819" might mistakenly have been recorded as "10 APR 1918", long after the person's death. The only way to reveal such inconsistencies is by rigorous validation of the content data.

Internationalization

The GEDCOM standard supports internationalization in several ways. First, newer versions of the standard allow data to be stored in Unicode (or, more recently, UTF-8), so text in any language can be stored (GEDCOM Standard 5.5, p. 45). Secondly, in the same way that a person can have multiple events, GEDCOM allows a person to have multiple names (GEDCOM Standard 5.5, p. 27), so names can be stored in multiple languages, although there is no standardized way to indicate which instance is in which language. Finally, in the latest draft version (5.5.1, not yet in widespread use), the NAME field also supports a phonetic variation (FONE) and a romanized variation (ROMN) of the name (GEDCOM Draft 5.5.1, p. 38).

GEDCOM X

In February 2012, at the RootsTech 2012 conference, FamilySearch outlined a major new project around genealogical standards called GEDCOM X, and invited collaboration. It will include software developed under the Apache open source license. It includes data formats that facilitate basing family trees on sources and records (both physical artifacts and digital artifacts), support for sharing and linking data online, and an API.

Alternatives to GEDCOM

Commsoft, the authors of the Roots series of genealogy software and Ultimate Family Tree ("CommSoft to Return?", Dick Eastman Online, 3/14/2001, Ancestry.com archive), defined a version called Event-Oriented GEDCOM, also known as "Event GEDCOM" and originally called InterGED (RootsWeb TMG-L, "[TMG InterGED/Event GEDCOM]", Fri, 15 Feb 2002 13:33:18 -0700), which included events as first-class (zero-level) items. Although it is event based, it is still a model built on assumed reality rather than evidence.
Event GEDCOM was more flexible, as it allowed some separation between believed events and the participants. However, Event GEDCOM was not widely adopted by other developers due to its semantic differences. With Roots and Ultimate Family Tree no longer available, very few people today are using Event GEDCOM.

GRAMPS XML is an XML-based open format created by the open source genealogy project GRAMPS and used also by PhpGedView.

See also

* FamilySearch
** Ancestral File Number
** International Genealogical Index
* GENDEX - Genealogical index
* Genealogical numbering systems
* GNTP - Genealogy Network Transfer Protocol
* Tiny Tafel Format - encoded "ancestor table"

External links

* Specifications (from the FamilySearch website)
** GEDCOM 5.5 Standard (PDF)
** GEDCOM 5.5.1 Draft (PDF)
** GEDCOM 5.5 Standard (Executable file in Envoy format)
** GEDCOM 5.5 specification (Paul McBride's HTML version with corrections added from the GEDCOM 5.5 Errata Sheet)
** GEDCOM 5.5.1-5 Draft (Unmodified) - PhpGedView
** GEDCOM Specification Future Direction (1999-07-07) (PDF)
** Draft Specification for GEDCOM XML 6.0 Version 2002-12-06 (PDF)
* Proprietary extension: GEDCOM 5.5 EL (Extended Locations) specification
* Commentary
** Overview of GEDCOM and its uses on Encyclopedia of Genealogy
** Cyndi's List - GEDCOM
*** Many tools exist to convert GEDCOM files to HTML pages; see Cyndi's List - GEDCOM to HTML Conversion
** Mapping GEDCOM to XML, an article with example software in the C# programming language
** on LDS Church's Adoption of the XML Standard
* Genealogical Data Communications Specs - List Archives on discussion of the specification
* Gedcom 5.5 standard in Windows Help File format

From Wikipedia, the free encyclopedia